Goto

Collaborating Authors

 change factor





Factored Adaptation for Non-Stationary Reinforcement Learning

Neural Information Processing Systems

Dealing with non-stationarity in environments (e.g., in the transition dynamics) and objectives (e.g., in the reward functions) is a challenging problem that is crucial in real-world applications of reinforcement learning (RL). While most current approaches model the changes as a single shared embedding vector, we leverage insights from the recent causality literature to model non-stationarity in terms of individual latent change factors, and causal graphs across different environments. In particular, we propose Factored Adaptation for Non-Stationary RL (FANS-RL), a factored adaption approach that learns jointly both the causal structure in terms of a factored MDP, and a factored representation of the individual time-varying change factors. We prove that under standard assumptions, we can completely recover the causal graph representing the factored transition and reward function, as well as a partial structure between the individual change factors and the state components. Through our general framework, we can consider general non-stationary scenarios with different function types and changing frequency, including changes across episodes and within episodes. Experimental results demonstrate that FANS-RL outperforms existing approaches in terms of return, compactness of the latent state representation, and robustness to varying degrees of non-stationarity.





Factored Adaptation for Non-Stationary Reinforcement Learning

Neural Information Processing Systems

Dealing with non-stationarity in environments (e.g., in the transition dynamics) and objectives (e.g., in the reward functions) is a challenging problem that is crucial in real-world applications of reinforcement learning (RL). While most current approaches model the changes as a single shared embedding vector, we leverage insights from the recent causality literature to model non-stationarity in terms of individual latent change factors, and causal graphs across different environments. In particular, we propose Factored Adaptation for Non-Stationary RL (FANS-RL), a factored adaption approach that learns jointly both the causal structure in terms of a factored MDP, and a factored representation of the individual time-varying change factors. We prove that under standard assumptions, we can completely recover the causal graph representing the factored transition and reward function, as well as a partial structure between the individual change factors and the state components. Through our general framework, we can consider general non-stationary scenarios with different function types and changing frequency, including changes across episodes and within episodes.


Factored Adaptation for Non-Stationary Reinforcement Learning

Feng, Fan, Huang, Biwei, Zhang, Kun, Magliacane, Sara

arXiv.org Artificial Intelligence

Dealing with non-stationarity in environments (e.g., in the transition dynamics) and objectives (e.g., in the reward functions) is a challenging problem that is crucial in real-world applications of reinforcement learning (RL). While most current approaches model the changes as a single shared embedding vector, we leverage insights from the recent causality literature to model non-stationarity in terms of individual latent change factors, and causal graphs across different environments. In particular, we propose Factored Adaptation for Non-Stationary RL (FANS-RL), a factored adaption approach that learns jointly both the causal structure in terms of a factored MDP, and a factored representation of the individual time-varying change factors. We prove that under standard assumptions, we can completely recover the causal graph representing the factored transition and reward function, as well as a partial structure between the individual change factors and the state components. Through our general framework, we can consider general non-stationary scenarios with different function types and changing frequency, including changes across episodes and within episodes. Experimental results demonstrate that FANS-RL outperforms existing approaches in terms of return, compactness of the latent state representation, and robustness to varying degrees of non-stationarity.


Decision-making and Fuzzy Temporal Logic

Nascimento, José Cláudio do

arXiv.org Artificial Intelligence

This paper shows that the fuzzy temporal logic can model figures of thought to describe decision-making behaviors. In order to exemplify, some economic behaviors observed experimentally were modeled from problems of choice containing time, uncertainty and fuzziness. Related to time preference, it is noted that the subadditive discounting is mandatory in positive rewards situations and, consequently, results in the magnitude effect and time effect, where the last has a stronger discounting for earlier delay periods (as in, one hour, one day), but a weaker discounting for longer delay periods (for instance, six months, one year, ten years). In addition, it is possible to explain the preference reversal (change of preference when two rewards proposed on different dates are shifted in the time). Related to the Prospect Theory, it is shown that the risk seeking and the risk aversion are magnitude dependents, where the risk seeking may disappear when the values to be lost are very high.